

Hardware-Adaptive and Superlinear-Capacity Memristor-based Associative Memory

He, Chengping, Jiang, Mingrui, Shan, Keyi, Yang, Szu-Hao, Li, Zefan, Wang, Shengbo, Pedretti, Giacomo, Ignowski, Jim, Li, Can

arXiv.org Artificial Intelligence

Brain-inspired computing aims to mimic cognitive functions like associative memory, the ability to recall complete patterns from partial cues. Memristor technology offers promising hardware for such neuromorphic systems due to its potential for efficient in-memory analog computing. Hopfield Neural Networks (HNNs) are a classic model for associative memory, but implementations on conventional hardware suffer from efficiency bottlenecks, while prior memristor-based HNNs faced challenges with vulnerability to hardware defects due to offline training, limited storage capacity, and difficulty processing analog patterns. Here we introduce, and experimentally demonstrate on integrated memristor hardware, a new hardware-adaptive learning algorithm for associative memories that significantly improves defect tolerance and capacity, and naturally extends to scalable multilayer architectures capable of handling both binary and continuous patterns. Our approach achieves 3x effective capacity under 50% device faults compared with state-of-the-art methods. Furthermore, its extension to multilayer architectures enables superlinear capacity scaling (∝ N^1.49 for binary patterns) and effective recall of continuous patterns (∝ N^1.74 scaling), compared with the linear capacity scaling of previous HNNs. It also provides the flexibility to adjust capacity for same-sized patterns by tuning the number of hidden neurons. By leveraging the massive parallelism of the hardware enabled by synchronous updates, it reduces energy by 8.8x and latency by 99.7% for 64-dimensional patterns relative to asynchronous schemes, with greater improvements at scale. These results promise more reliable memristor-based associative memory systems and enable new applications research due to the significantly improved capacity, efficiency, and flexibility.
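For context, the classical HNN baseline that this work improves on stores patterns with a Hebbian outer-product rule and recalls them via synchronous sign updates. The sketch below is that textbook baseline in NumPy, not the paper's hardware-adaptive algorithm; the two stored patterns and the corruption level are arbitrary illustration choices.

```python
import numpy as np

def hebbian_weights(patterns):
    # Outer-product (Hebbian) rule with self-connections zeroed.
    n = patterns.shape[1]
    W = patterns.T @ patterns / n
    np.fill_diagonal(W, 0.0)
    return W

def recall(W, probe, max_steps=20):
    # Synchronous sign updates until a fixed point or the step limit.
    s = probe.copy()
    for _ in range(max_steps):
        nxt = np.where(W @ s >= 0, 1, -1)
        if np.array_equal(nxt, s):
            break
        s = nxt
    return s

# Two orthogonal 16-dimensional +/-1 patterns.
p1 = np.array([1] * 8 + [-1] * 8)
p2 = np.array([1, -1] * 8)
W = hebbian_weights(np.stack([p1, p2]))

cue = p1.copy()
cue[:4] *= -1                               # corrupt 4 of 16 bits
print(np.array_equal(recall(W, cue), p1))   # -> True
```

The synchronous update here is what the hardware parallelizes: all neurons read the matrix-vector product at once, rather than updating one neuron at a time as in asynchronous schemes.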


A Statistical Analysis for Per-Instance Evaluation of Stochastic Optimizers: How Many Repeats Are Enough?

Noori, Moslem, Valiante, Elisabetta, Van Vaerenbergh, Thomas, Mohseni, Masoud, Rozada, Ignacio

arXiv.org Artificial Intelligence

A key trait of stochastic optimizers is that multiple runs of the same optimizer on the same problem can produce different results. As a result, their performance is evaluated over several repeats, or runs, on the problem. However, the accuracy of the estimated performance metrics depends on the number of runs and should be studied using statistical tools. We present a statistical analysis of the common metrics and develop guidelines for experiment design to measure an optimizer's performance using these metrics with a high level of confidence and accuracy. To this end, we first discuss the confidence intervals of the metrics and how they relate to the number of runs in an experiment. We then derive a lower bound on the number of repeats required to guarantee a given accuracy in the metrics. Using this bound, we propose an algorithm that adaptively adjusts the number of repeats to ensure the accuracy of the evaluated metric. Our simulation results demonstrate the utility of our analysis, showing how it enables reliable benchmarking and hyperparameter tuning and prevents premature conclusions about the performance of stochastic optimizers.
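The general shape of such an adaptive scheme can be sketched as follows: keep adding runs until the confidence interval on the estimated metric is tight enough. This is a generic normal-approximation illustration, not the authors' derived bound; the tolerance `eps`, the toy optimizer, and all constants are assumptions for the example.

```python
import math
import random

def run_until_accurate(optimizer, eps=0.05, z=1.96, min_runs=10, max_runs=10_000):
    """Repeat a stochastic optimizer until the normal-approximation
    confidence interval on its mean result has half-width <= eps.
    Returns (mean_estimate, number_of_runs)."""
    results = [optimizer() for _ in range(min_runs)]
    while len(results) < max_runs:
        n = len(results)
        mean = sum(results) / n
        var = sum((x - mean) ** 2 for x in results) / (n - 1)
        half_width = z * math.sqrt(var / n)
        if half_width <= eps:
            return mean, n
        results.append(optimizer())
    return sum(results) / len(results), len(results)

# Toy stochastic "optimizer": a noisy objective value around 1.0.
rng = random.Random(42)

def noisy():
    return 1.0 + rng.gauss(0.0, 0.3)

mean, n = run_until_accurate(noisy)
print(f"mean={mean:.3f} after {n} runs")
```

Note the stopping rule adapts to the observed variance: a noisier optimizer automatically triggers more repeats before the estimate is accepted.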


Experimental Demonstration of an Optical Neural PDE Solver via On-Chip PINN Training

Zhao, Yequan, Xiao, Xian, Descos, Antoine, Yuan, Yuan, Yu, Xinling, Kurczveil, Geza, Fiorentino, Marco, Zhang, Zheng, Beausoleil, Raymond G.

arXiv.org Artificial Intelligence

Examples include electromagnetic modeling and thermal analysis of IC chips [1], medical imaging [2], and safety verification of autonomous systems [3]. Discretization-based solvers (e.g., finite-difference and finite-element methods) convert a PDE into a large-scale algebraic equation via spatial and temporal discretization. Solving the resulting algebraic equation often requires massive digital resources and long run times. The physics-informed neural network (PINN) is a promising discretization-free and unsupervised approach to solving PDEs [4]. A PINN uses the residuals of the PDE operator and the boundary/initial conditions to set up a loss function, then minimizes the loss to train a neural network as a global approximation of the PDE solution. However, current PINN training typically takes several to dozens of hours on a powerful GPU, hindering the deployment of a real-time neural PDE solver on edge devices.
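To make the residual-loss construction concrete, the sketch below solves a 1D Poisson problem u''(x) = f(x), u(0) = u(1) = 0, by minimizing the squared PDE residual at collocation points. To stay dependency-free it swaps the neural network for a tiny sine basis whose second derivative is analytic (so no autodiff library is needed); this illustrates the PINN loss idea only, not the paper's on-chip optical training scheme.

```python
import numpy as np

# Interior collocation points in (0, 1); exact solution is sin(pi*x).
x = np.linspace(0.0, 1.0, 41)[1:-1]
f = -np.pi**2 * np.sin(np.pi * x)

# "Network": u(x) = sum_k a_k sin(k*pi*x). Boundary conditions hold by
# construction, and u'' is analytic, so the residual is a linear map of a.
K = 3
k = np.arange(1, K + 1)
S = -(k * np.pi) ** 2 * np.sin(np.outer(x, k) * np.pi)  # columns: basis u''

a = np.zeros(K)
lr = 1e-4
for _ in range(3000):
    residual = S @ a - f                    # PDE residual at collocation points
    a -= lr * 2.0 * S.T @ residual / len(x) # gradient step on mean-squared residual

u = np.sin(np.outer(x, k) * np.pi) @ a
print(np.max(np.abs(u - np.sin(np.pi * x))))  # maximum error vs exact solution
```

With a real neural network the residual is no longer linear in the parameters and u'' comes from automatic differentiation, but the loss (mean squared PDE residual plus boundary terms) has exactly this structure.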


OpenAI whistleblower who died was being considered as witness against company

The Guardian

Balaji worked at OpenAI for nearly four years before quitting in August. He had been well-regarded by colleagues at the San Francisco company, where a co-founder this week called him one of OpenAI's strongest contributors who was essential to developing some of its products. "We are devastated to learn of this incredibly sad news and our hearts go out to Suchir's loved ones during this difficult time," said a statement from OpenAI. Balaji was found dead in his San Francisco apartment on 26 November in what police said "appeared to be a suicide. No evidence of foul play was found during the initial investigation."


Combinatorial Reasoning: Selecting Reasons in Generative AI Pipelines via Combinatorial Optimization

Esencan, Mert, Kumar, Tarun Advaith, Asanjan, Ata Akbari, Lott, P. Aaron, Mohseni, Masoud, Unlu, Can, Venturelli, Davide, Ho, Alan

arXiv.org Artificial Intelligence

Recent Large Language Models (LLMs) have demonstrated impressive capabilities at tasks that require human intelligence and are a significant step towards human-like artificial intelligence (AI). Yet the performance of LLMs at reasoning tasks has been subpar, and the reasoning capability of LLMs is a matter of significant debate. While it has been shown that the choice of prompting technique can alter an LLM's performance on a multitude of tasks, including reasoning, the best-performing techniques require human-crafted prompts written with knowledge of the task at hand. We introduce a framework for what we call Combinatorial Reasoning (CR), a fully automated prompting method in which reasons are sampled from an LLM pipeline and mapped into a Quadratic Unconstrained Binary Optimization (QUBO) problem. The framework investigates whether QUBO solutions can be profitably used to select a useful subset of the reasons for constructing a Chain-of-Thought style prompt. We explore the acceleration of CR with specialized solvers. We also investigate the performance of simpler zero-shot strategies such as a linear majority rule or random selection of reasons. Our preliminary study indicates that coupling a combinatorial solver to generative AI pipelines is an interesting avenue for AI reasoning and elucidates design principles for future CR methods.
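A minimal sketch of the reason-selection step, assuming a simple energy with per-reason relevance on the QUBO diagonal and pairwise redundancy penalties off-diagonal (both invented for illustration); the brute-force solve stands in for the specialized solvers the paper explores.

```python
import itertools
import numpy as np

def qubo_select(relevance, redundancy, penalty=1.0):
    """Pick a subset of reasons (binary vector x) minimizing
    E(x) = -sum_i r_i x_i + penalty * sum_{i<j} d_ij x_i x_j.
    Brute force is fine for small n; CR would use a dedicated solver."""
    n = len(relevance)
    best_x, best_e = None, float("inf")
    for bits in itertools.product([0, 1], repeat=n):
        x = np.array(bits)
        e = -relevance @ x + penalty * x @ np.triu(redundancy, 1) @ x
        if e < best_e:
            best_x, best_e = x, e
    return best_x, best_e

# Toy example: 4 candidate reasons; reasons 0 and 1 are near-duplicates,
# so the redundancy penalty should keep only one of them.
relevance = np.array([3.0, 2.5, 2.0, 0.1])
redundancy = np.zeros((4, 4))
redundancy[0, 1] = redundancy[1, 0] = 4.0
selected, energy = qubo_select(relevance, redundancy)
print(selected)   # reason 1 is dropped in favor of its duplicate, reason 0
```

The selected reasons would then be concatenated into a Chain-of-Thought style prompt; only the subset-selection step is modeled here.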


Carbon Footprint Reduction for Sustainable Data Centers in Real-Time

Sarkar, Soumyendu, Naug, Avisek, Luna, Ricardo, Guillen, Antonio, Gundecha, Vineet, Ghorbanpour, Sahand, Mousavi, Sajad, Markovikj, Dejan, Babu, Ashwin Ramesh

arXiv.org Artificial Intelligence

As machine learning workloads significantly increase energy consumption, sustainable data centers with low carbon emissions are becoming a top priority for governments and corporations worldwide. This requires a paradigm shift in optimizing power consumption in cooling and IT loads, shifting flexible loads based on the availability of renewable energy in the power grid, and leveraging battery storage from the uninterruptible power supply in data centers, using collaborative agents. The complex association between these optimization strategies and their dependencies on variable external factors like weather and the power grid carbon intensity makes this a hard problem. Currently, a real-time controller to optimize all these goals simultaneously in a dynamic real-world setting is lacking. We propose a Data Center Carbon Footprint Reduction (DC-CFR) multi-agent Reinforcement Learning (MARL) framework that optimizes data centers for the multiple objectives of carbon footprint reduction, energy consumption, and energy cost. The results show that the DC-CFR MARL agents effectively resolved the complex interdependencies in optimizing cooling, load shifting, and energy storage in real-time for various locations under real-world dynamic weather and grid carbon intensity conditions. DC-CFR significantly outperformed the industry-standard ASHRAE controller with a considerable reduction in carbon emissions (14.5%), energy usage (14.4%), and energy cost (13.7%) when evaluated over one year across multiple geographical regions.


Open Image Content Disarm And Reconstruction

Belkind, Eli, Dubin, Ran, Dvir, Amit

arXiv.org Artificial Intelligence

With advances in malware technology, attackers create new ways to hide their malicious code from antivirus services. One way to obfuscate an attack is to use common files as cover for malicious scripts, so the malware looks like a legitimate file. Although cutting-edge artificial intelligence and content-signature methods exist, evasive malware successfully bypasses next-generation malware detection using advanced methods like steganography. Image files (e.g., JPEG) are among the files commonly used to hide malware. In addition, some malware uses steganography to hide malicious scripts or sensitive data in images. Steganography in images is difficult to detect even with specialized tools. Image-based attacks try to compromise the user's device using malicious payloads, or use image steganography to hide sensitive data inside legitimate images and leak it outside the user's device. Therefore, in this paper, we present a novel Image Content Disarm and Reconstruction (ICDR) system. Our ICDR system removes potential malware with a zero-trust approach while maintaining high image quality and file usability. By extracting the image data, removing it from the rest of the file, and manipulating the image pixels, it is possible to disable or remove the malware hidden inside the file.
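The core idea, rewriting pixel data so hidden payloads cannot survive, can be illustrated with least-significant-bit (LSB) steganography, one common embedding. The sketch below is a toy model in NumPy, not the paper's ICDR system: it randomizes every LSB, which destroys an LSB payload while perturbing each pixel by at most one intensity level.

```python
import numpy as np

def embed_lsb(pixels, payload_bits):
    # Hide bits in the least-significant bit of the first pixels
    # (toy stand-in for image steganography).
    out = pixels.copy()
    flat = out.reshape(-1)
    flat[:len(payload_bits)] = (flat[:len(payload_bits)] & 0xFE) | payload_bits
    return out

def disarm(pixels, rng):
    # Zero-trust sanitization sketch: randomize every LSB, so any hidden
    # LSB payload is destroyed while each pixel changes by at most 1.
    noise = rng.integers(0, 2, size=pixels.shape, dtype=pixels.dtype)
    return (pixels & 0xFE) | noise

rng = np.random.default_rng(7)
image = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
secret = rng.integers(0, 2, size=128, dtype=np.uint8)

stego = embed_lsb(image, secret)
clean = disarm(stego, rng)
recovered = clean.reshape(-1)[:128] & 1

print(int(np.max(np.abs(clean.astype(int) - stego.astype(int)))))  # at most 1
print(np.mean(recovered != secret))  # roughly 0.5: the payload is destroyed
```

A production system additionally re-encodes the file container and strips metadata; only the pixel-manipulation step is modeled here.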


A Machine Learning based Empirical Evaluation of Cyber Threat Actors High Level Attack Patterns over Low level Attack Patterns in Attributing Attacks

Noor, Umara, Shahid, Sawera, Kanwal, Rimsha, Rashid, Zahid

arXiv.org Artificial Intelligence

Cyber threat attribution is the process of identifying the actor behind an attack incident in cyberspace. Accurate and timely threat attribution plays an important role in deterring future attacks by enabling appropriate and timely defense mechanisms. Manual analysis of attack patterns gathered by honeypot deployments, intrusion detection systems, firewalls, and trace-back procedures is still the preferred method of security analysts for cyber threat attribution. Such attack patterns are low-level Indicators of Compromise (IOCs). They represent the Tactics, Techniques, and Procedures (TTPs) and software tools used by adversaries in their campaigns, and the adversaries rarely re-use them. They can also be manipulated, resulting in false and unfair attribution. To empirically evaluate and compare the effectiveness of both kinds of IOCs, two problems need to be addressed. First, recent research has discussed the ineffectiveness of low-level IOCs for cyber threat attribution only intuitively; an empirical evaluation of their effectiveness on a real-world dataset is missing. Second, the available dataset for high-level IOCs has a single instance per predictive class label and therefore cannot be used directly to train machine learning models. To address these problems, we empirically evaluate the effectiveness of low-level IOCs on a real-world dataset built specifically for comparative analysis with high-level IOCs. The experimental results show that models trained on high-level IOCs attribute cyberattacks with 95% accuracy, compared to 40% for models trained on low-level IOCs.


X-TIME: An in-memory engine for accelerating machine learning on tabular data with CAMs

Pedretti, Giacomo, Moon, John, Bruel, Pedro, Serebryakov, Sergey, Roth, Ron M., Buonanno, Luca, Ziegler, Tobias, Xu, Cong, Foltin, Martin, Faraboschi, Paolo, Ignowski, Jim, Graves, Catherine E.

arXiv.org Artificial Intelligence

Structured, or tabular, data is the most common format in data science. While deep learning models have proven formidable in learning from unstructured data such as images or speech, they are less accurate than simpler approaches when learning from tabular data. In contrast, modern tree-based Machine Learning (ML) models shine at extracting relevant information from structured data. An essential requirement in data science is to reduce model inference latency in cases where, for example, models are used in a closed loop with simulation to accelerate scientific discovery. However, the hardware acceleration community has mostly focused on deep neural networks and largely ignored other forms of machine learning. Previous work has described the use of an analog content addressable memory (CAM) component for efficiently mapping random forests. In this work, we focus on an overall analog-digital architecture implementing a novel increased-precision analog CAM and a programmable network-on-chip that allow inference with state-of-the-art tree-based ML models such as XGBoost and CatBoost. Results evaluated on a single chip in 16 nm technology show 119x lower latency and 9740x higher throughput compared with a state-of-the-art GPU, with a 19 W peak power consumption.
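The tree-to-CAM mapping works by turning each root-to-leaf path into a row of per-feature intervals that are matched against the input in parallel. Below is a software sketch of that idea for a hypothetical depth-2 tree (the tree, thresholds, and classes are invented for illustration); a real analog CAM performs the row comparisons physically rather than with NumPy.

```python
import numpy as np

# Each leaf of a depth-2 decision tree over features (x0, x1) becomes one
# CAM row of per-feature [low, high) intervals; "don't care" = (-inf, inf).
# Tree: if x0 < 0.5: (x1 < 0.3 -> class 0, else class 1)
#       else:        (x1 < 0.7 -> class 1, else class 2)
lows = np.array([[-np.inf, -np.inf],
                 [-np.inf, 0.3],
                 [0.5, -np.inf],
                 [0.5, 0.7]])
highs = np.array([[0.5, 0.3],
                  [0.5, np.inf],
                  [np.inf, 0.7],
                  [np.inf, np.inf]])
labels = np.array([0, 1, 1, 2])

def cam_classify(x):
    # All rows are compared against the input at once, like a CAM's
    # word-parallel search; exactly one row matches any decision-tree input.
    match = np.all((lows <= x) & (x < highs), axis=1)
    return labels[np.argmax(match)]

print(cam_classify(np.array([0.2, 0.9])))   # matches row 1 -> class 1
```

Because the match is one parallel lookup rather than a sequential root-to-leaf traversal, inference latency becomes independent of tree depth, which is the source of the latency advantage.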


Fulltime Cloud Architect openings in California on September 25, 2022

#artificialintelligence

The Cloud and Big Data Software Architect is responsible for leading technical efforts related to modern data engineering based on cloud computing technologies. The candidate is expected to have demonstrated experience in the domain, with a proven record of architecting data-centric systems at cloud scale that support complex use cases. Experience with public cloud environments, with a focus on the various data services and cloud-native software architecture, is important for the position. Responsibilities include effectively participating in defining, designing, architecting, and implementing cloud-based systems; mastering new and emerging technologies and effectively applying them to new systems; and staying at the forefront of developments related to data.